



| PROPERTIES OF DFA AND NFA TECHNIQUES<br>USED ON CONVENTIONAL MICROPROCESSORS | STORAGE BOUND ON<br># OF STATES (FOR<br>R CHARACTER<br>REGULAR EXPRESSION) | EVALUATION TIME<br>(FOR N BYTES OF INPUT)<br>[ORDER OF] |
|------------------------------------------------------------------------------|----------------------------------------------------------------------------|---------------------------------------------------------|
| DETERMINISTIC FINITE STATE AUTOMATA<br>OR DFA RUNNING ON A GP CPU            | $2^R$<br>(NEEDS VERY LARGE MEMORY)                                         | $N$<br>MEMORY ACCESS CYCLES                             |
| NON-DETERMINISTIC FINITE STATE<br>AUTOMATA NFA RUNNING ON A GP CPU           | $R$                                                                        | $R * N$<br>CPU CACHE+BRANCH CYCLES                      |

FIG. 1A

}

## CPU WALKING DFA TABLE IN DRAM



## COPROCESSOR CLOSER TO TABLE IN SRAM



PERFORMANCE ON EVALUATING REGULAR EXPRESSIONS ON EVERY BYTE OF INPUT STREAM

100s of RES @ 280 Mbps

100S OF MBS OF SRAM

GIGABYTES OF MEMORY

1B  
EIG



FIG. 2

209  
204  
203  
202  
201



**FIG. 3A**



FIG. 3B

209

208

207

206

205

204

203

202

201

200

199

198

197

196

195

194

193

192

191

190

189

188

187

186

185

184

183

182

181

180

179

178

177

176

175

174

173

172

171

170

169

168

167

166

165

164

163

162

161

160

159

158

157

156

155

154

153

152

151

150

149

148

147

146

145

144

143

142

141

140

139

138

137

136

135

134

133

132

131

130

129

128

127

126

125

124

123

122

121

120

119

118

117

116

115

114

113

112

111

110

109

108

107

106

105

104

103

102

101

100

99

98

97

96

95

94

93

92

91

90

89

88

87

86

85

84

83

82

81

80

79

78

77

76

75

74

73

72

71

70

69

68

67

66

65

64

63

62

61

60

59

58

57

56

55

54

53

52

51

50

49

48

47

46

45

44

43

42

41



FIG. 4



FIG. 5



**FIG. 6**

- SIMPLE REGULAR STRUCTURE ENABLES A HIGH DENSITY → DENSE ARRAY OF MULTIPLE TILES
- SEVERAL THOUSANDS OF AUTOMATA (ORGANIZED AS MULTIPLE ROWS OF TILES) CAN FIT ON A SINGLE DIE ON 0.13 $\mu$  TECHNOLOGY



FIG. 7



FIG. 8A



| PROPERTIES OF DFA AND NFA TECHNIQUES USED<br>ON CONVENTIONAL MICROPROCESSORS | STORAGE: BOUND ON<br># OF STATES (FOR R CHARACTERS) | EVALUATION TIME (FOR<br>N BYTES) [ORDER OF]                   |
|------------------------------------------------------------------------------|-----------------------------------------------------|---------------------------------------------------------------|
| DETERMINISTIC FINITE STATE AUTOMATA<br>OR DFA RUNNING ON A GP CPU            | $2^R$<br><br>(NEEDS VERY LARGE MEMORY)              | $N$<br><br>MEMORY ACCESS CYCLES<br>(~100ns)                   |
| NON-DETERMINISTIC FINITE STATE AUTOMATA<br>OR NFA RUNNING ON A GP CPU        | R                                                   | $R * N$<br><br>CPU CACHE+BRANCH<br>CYCLES (~4ns)              |
| NON-DETERMINISTIC FINITE STATE AUTOMATA<br>OR NFA RUNNING ON THE APPARATUS   | R                                                   | $N$<br><br>TIGHT ON CHIP STATE<br>TRANSITION<br>CYCLE (~1 ns) |

FIG. 9A  
(PRIOR ART)

REGULAR EXPRESSION CO-  
PROCESSOR USING EXEMPLARY STATE  
MACHINE ARCHITECTURE

COPROCESSOR CLOSER TO TABLE  
IN SRAM

CPU WALKING DFA TABLE IN DRAM



100s OF RES @ 100 Mbps      1000s OF RES @ 280 Mbps      1000s OF RES @ > 10 Gbps

GIGABYTES OF MEMORY      100s OF MBS OF SRAM      NO TABLE MEMORY NEEDED

TWO ORDERS OF MAGNITUDE SPEEDUP WITHOUT NEED FOR TABLE MEMORY

FIG. 9B